ABSTRACT

We develop some two-person zero-sum game formulations of search 
and evasion problems. By employing a game theoretic approach, we 
allow the hider, as well as the searcher, to choose a strategy. This is 
in contrast to most search models which assume a stationary or passive 
hider. Both non-sequential and sequential search games are investigated.
Some interesting aspects of the non-sequential game and an example of
an antisubmarine search problem are given. The sequential games consist
of a sequence of moves. When the players move, they determine not only
a payoff but also the probability that the game terminates before
the next move. When at most a finite number of moves is allowed, we
prove that a solution may be found by solving a recursive sequence of 
matrix games. When the number of moves is not bounded, the game is 
characterized by a special type of non-linear program. The solution 
to this program can be approximated by successive perturbations of a 
related linear program. Finally, we obtain the result that a pair of 
strategies minimaxes the expected duration of the game if and only if 
these strategies also maximin the probability of termination in one step. 
1. INTRODUCTION



We investigate some two-person zero-sum games which typically
arise in search and evasion settings. One of the classical problems in
the theory of search is to determine an optimal division of search effort
among n cells.^1 We explore this problem when both the searcher (P1)
and the hider (P2) can choose a cell on each move. Here we allow
for the active participation of the hider, as opposed to other formulations
which require a stationary hider. An n-cell search game, as we call it,
and two sequential extensions of the n-cell game are proposed.

We summarize our game formulations and results. In the n-cell
game, every play consists of exactly one move. The payoff function is
taken to be the probability that P1 detects P2. This payoff is particularly
appealing for Antisubmarine Warfare (ASW) applications. With
the indicated payoff, a zero-sum assumption corresponds to the role of
an evader for P2. To illustrate this point, an example of a typical ASW
situation is given. We also show how a constrained-game extension can
be employed to include additional tactical information.

The n-cell game is extended to a sequential game. In turn, we consider
sequential games consisting of a finite number and an infinite number
of moves. On each move, the players determine not only an immediate
payoff but also the probability that the game terminates
before the next move. This game may be thought of as a matrix game
which is played again with a probability determined by the players.

^1 For an example, see Koopman [9] and Bellman [1].

For the finite sequential game, we show how optimal strategies and 
the value may be computed by dynamic programming. In particular, the 
solution of the game can be found by solving a recursive sequence of 
matrix games. The amount of computational effort is usually much less 
than would be required by solution of the game in normal form. 

When an infinite number of moves is allowed, we show how to
characterize the resulting sequential game by a special type of non-linear
program. If one of the variables in the constraint set of this
non-linear program is held fixed, it becomes a linear program. To
find the solution of the sequential game, we must adjust this variable in
the constraint set to make the optimal value of the objective function equal
to zero. We show how to perturb the linear program and thereby approximate
the game solution to within a desired accuracy.

Many of the search models which appear in the literature assume a
stationary hider. Models of this type are given by Koopman [9],
Bellman [1], Pollock [15], Dobbie [7], and MacQueen [12]. On the
other hand, game formulations which allow the hider to choose strategies
are presented by von Neumann [19], Norris [14], and Neuts [13]. Our
games are generalizations and extensions of these search games. In
particular, one of von Neumann's [19] search games is a special case
of the n-cell search game which is presented in section 2.
2. THE N-CELL SEARCH GAME

2.1 Formulation

To formulate the n-cell search game, we assume that the searcher
has one detection device and that there is only one hider. Later, we
relax these assumptions. The search region of interest is divided into
n cells. A pure strategy for the searcher (P1) is a cell to search (that is,
a cell in which to locate his detection device), and a pure strategy for the
hider (P2) is a cell in which to hide. A play consists of exactly one
simultaneous choice of strategies (a move); of course, the players may
instead choose their strategies sequentially, provided the second choice is
made in ignorance of the first.

To define the payoffs, we assume that if P1 looks in cell i and P2
hides in cell j, then P1 detects P2 with probability $a_{ij}$ (i, j = 1, ..., n).
Let A be the n x n payoff matrix $A = (a_{ij})$.

To embrace tactical encounters, we have postulated the payoff as a
probability of detection. Other search payoffs could be used as well.
We have also allowed the probability of detection to be a function of the
range between P1 and P2. This feature is not included in most other
detection models.

Now, suppose that P1 searches cell i with probability $x_i$, and
suppose that P2 hides in cell j with probability $y_j$. We require that

$$\sum_{i=1}^{n} x_i = 1, \quad x_i \ge 0, \quad i = 1, \ldots, n,$$

$$\sum_{j=1}^{n} y_j = 1, \quad y_j \ge 0, \quad j = 1, \ldots, n.$$

Let X and Y be the n x 1 vectors $X = (x_1, \ldots, x_n)$ and $Y = (y_1, \ldots, y_n)$.
Then X and Y are mixed strategies for P1 and P2, respectively. If P1
chooses X and P2 chooses Y then, from elementary probability, P1 detects
P2 with probability $X^t A Y$.^1 We assume that P1 chooses X to maximize
his detection probability $X^t A Y$. Now consider the case when P2
chooses Y to minimize $X^t A Y$. Then we can interpret the motive of P2
as evasive action, since P2 is attempting to minimize the probability that
he is detected. This action by P2 also gives rise to a zero-sum game.
It is important to note this relationship between evasion and a zero-sum
game. We henceforth restrict our discussion to zero-sum games.

^1 $X^t$ denotes the transpose of the vector X.
2.2 An Example

To illustrate the n-cell game, consider the following tactical example.
Suppose a submarine must pass through a channel to get from its base to
its patrol area.^1 The searcher, P1, wishes to locate a detection device in
the channel to detect submarines as they pass through. For convenience,
we assume that the search region is divided into 15 cells, as shown in
Figure 1. We assume that P1 wants to locate his device in one of these
cells to maximin the probability of detection.

Figure 1

^1 This type of situation was encountered in the Bay of Biscay during
World War II; see Sternhell and Thorndike [18].
To obtain the payoff matrix, we need the detection probabilities.
These probabilities are found from the probability of detection versus
range curve, as shown in Figure 2.

Figure 2. Probability of Detection versus Range



To find the payoff matrix, suppose that P1 locates his detection device
in cell 5, and P2 hides in cell 8; then the range is three cells, and the
probability of detection, from Figure 2, is $a_{58} = 0.367$. The other
elements of the payoff matrix A are determined in a similar manner, and the
complete matrix is given in Figure 3.

Figure 3. Player 1's Payoff Matrix (all blank elements are zeros)
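The construction of the payoff matrix from a range curve can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the cells are taken to lie along a line so that the range between cells i and j is |i - j| (consistent with the cell 5 / cell 8 example, but not confirmed since Figure 1 is not reproduced), and the curve `HYPOTHETICAL_CURVE` is a placeholder in which only the value 0.367 at range three comes from the text. The names `build_payoff_matrix` and `detection_prob` are hypothetical.

```python
def build_payoff_matrix(n_cells, detection_prob):
    """Return the n x n matrix a[i][j] = detection_prob(range between cells).

    Assumes the cells lie along a line, so the range between cell i and
    cell j is |i - j| cells (an assumption; Figure 1 is not reproduced).
    """
    return [[detection_prob(abs(i - j)) for j in range(n_cells)]
            for i in range(n_cells)]


# Hypothetical detection-versus-range curve. Only the value at range 3
# (0.367) comes from the text; every other number is a placeholder.
HYPOTHETICAL_CURVE = {0: 1.0, 1: 0.8, 2: 0.6, 3: 0.367, 4: 0.1}


def detection_prob(cell_range):
    # Ranges beyond the tabulated curve are treated as undetectable (zero).
    return HYPOTHETICAL_CURVE.get(cell_range, 0.0)


A = build_payoff_matrix(15, detection_prob)
# Cells are numbered from 1 in the text, so cell 5 / cell 8 is A[4][7].
print(A[4][7])
```

Passing the curve as a function keeps the matrix construction independent of how the detection probabilities were measured.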



The value and optimal strategies (solution) of the game can now be
computed. The solution was computed by linear programming, and it is
tabulated in Figure 4. From this figure, we observe how the boundaries
of the search region affect the searching strategy. A substantial amount
of the effort is allocated to the end cells. Notice that the only data
required for this search model is a probability of detection versus range
curve.
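For reference, the linear program behind such a computation can be written in the standard form for a matrix game. It is the same form as program (11) of section 4.2 with s = 0; here u is the detection probability that P1 can guarantee and e is the n x 1 vector of ones.

```latex
\begin{aligned}
\max\ & u \\
\text{subject to}\ & u\,e^{t} - X^{t} A \le 0, \\
& X^{t} e = 1, \\
& X \ge 0 .
\end{aligned}
```

At an optimum, u is the value of the game and X is an optimal mixed strategy for P1; an optimal strategy for P2 is obtained from the dual program.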
2.3 Extensions

Before leaving this subject, we point out some possible extensions.
Additional probabilistic information can be included in the n-cell game
by considering a constrained game extension. The elegant development
of a constrained game by Charnes [2] can then be applied directly. To
illustrate the type of constraints which may arise, suppose that the
searcher can bound the probability that the hider chooses certain cells,
i.e., the searcher determines numbers $L_j$ and $U_j$ ($0 \le L_j \le U_j \le 1$) such
that

$$L_j \le y_j \le U_j. \tag{1}$$

These bounds may arise from intelligence or previous contacts.
Constraints of the type in equation (1) and, in general, any linear
inequalities can be included in a constrained game formulation. The method
of Charnes [2] can then be employed.

It is also desirable to relax the assumption that P1 has only one
detection device. This can be easily accomplished by redefining P1's pure
strategies in the following way. For simplicity, suppose P1 has two
detection devices. Then let each pure strategy for P1 be the two-tuple (k, i),
where cell k denotes the location of device 1 and cell i is the location of
device 2. Now the probability of detection can be calculated for each such
pure strategy, and again an ordinary matrix game is obtained. In a
similar way, we can also allow P2 to consist of two or more hiders.
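A minimal Python sketch of the two-device expansion follows. The text does not say how the joint detection probability is calculated, so the sketch assumes that the two devices detect independently; that rule, the function name `two_device_matrix`, and the choice to list unordered pairs (k, i) with k <= i are all assumptions made for illustration.

```python
from itertools import combinations_with_replacement


def two_device_matrix(A):
    """Expand a one-device payoff matrix A into a two-device matrix.

    Each row corresponds to a pair (k, i) of device locations. The
    detection probability against a hider in cell j is computed as
    1 - (1 - A[k][j]) * (1 - A[i][j]), which ASSUMES the two devices
    detect independently; the text does not specify this rule.
    """
    n = len(A)
    pairs = list(combinations_with_replacement(range(n), 2))
    matrix = [[1.0 - (1.0 - A[k][j]) * (1.0 - A[i][j]) for j in range(n)]
              for (k, i) in pairs]
    return pairs, matrix


# Hypothetical two-cell single-device matrix.
A = [[1.0, 0.5],
     [0.5, 1.0]]
pairs, B = two_device_matrix(A)
for pair, row in zip(pairs, B):
    print(pair, row)
```

The expanded matrix is no longer square, but it is still an ordinary matrix game and can be solved in the same way.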

Next, we formulate a finite sequential version of the n-cell search game.
3. THE FINITE SEQUENTIAL GAME

3.1 Formulation

First, we discuss the elements of the finite sequential game, and
then we proceed with the mathematical formulation. A play of the game
consists of, at most, a finite number (N) of moves. On each move, when
the game has not terminated, the players are faced with a two-person
zero-sum game. In our formulation, we shall use the n-cell game as the
two-person zero-sum game for each move. When the players move, they
each choose a strategy which determines a zero-sum payoff from player 2
to player 1 and a probability that the game terminates before the next
move. We wish to find an optimal strategy for each player which minimaxes
the expected accumulated payments received by player 1.

The recursive optimization technique which we will propose has also
been discussed by other authors. Kuhn [10] (1953) gave his theorem on
games of perfect recall, which paved the way for further work. Shapley
[17] (1953) was the first to point out the recursive character of a
generalization of this game, although he did not deal with the finite case.
Later contributions were made by Bellman [1] (1957), Everett [8] (1957),
Zachrisson [22] (1964), and Denardo [6] (1965).

The payoffs and continuation probabilities are now specified. Suppose
that P1 searches cell i and P2 hides in cell j on move r. Then the payoff
from P2 to P1 is

$$a_{ij}(r), \quad i, j = 1, \ldots, n, \quad r = 1, \ldots, N.$$

Also, when P1 searches cell i and P2 hides in cell j on move r, the game
continues until move r + 1 with probability

$$p_{ij}(r), \quad i, j = 1, \ldots, n, \quad r = 1, \ldots, N - 1.$$

We let $A_r$ be the n x n matrix $A_r = (a_{ij}(r))$ and $P_r$ the n x n matrix
$P_r = (p_{ij}(r))$. Hence, $A_r$ is P1's payoff matrix for move r and $P_r$ is the
matrix of continuation probabilities for move r. We assume that the game
is zero sum and that P1 is the maximizing player.

Next, we consider strategies for the players. We have assumed that
the continuation probability and payoff depend only on the choices available
for a particular move. It follows that the game is one of perfect recall as
defined by Kuhn [10]. Kuhn's theorem for a game of perfect recall asserts
that a "behavior strategy" is optimal. For this particular game, a behavior
strategy takes the following form: let $X_r$ and $Y_r$ be mixed strategies over
the alternatives available on move r for P1 and P2, respectively. Let
$X = (X_1, \ldots, X_N)$ be an N-tuple of the above mixed strategies for P1.
Then X is a behavior strategy for P1. Similarly, we define a behavior
strategy, $Y = (Y_1, \ldots, Y_N)$, for P2. Now the following sets of strategies
are introduced:

$$\mathcal{X}_r = \{X_r\}, \quad \mathcal{Y}_r = \{Y_r\}, \quad \mathcal{X} = \{X\}, \quad \mathcal{Y} = \{Y\}.$$

From Kuhn's theorem of perfect recall, the sets $\mathcal{X}$ and $\mathcal{Y}$ contain optimal
game strategies for P1 and P2, respectively; and we will, therefore,
limit our search for optimal strategies to these sets.

The total expected payoff for P1 will be expressed in terms of fixed
strategies $X \in \mathcal{X}$, $Y \in \mathcal{Y}$ and the given information. If P1 chooses the
strategy $X_r \in \mathcal{X}_r$ for move r and P2 chooses $Y_r \in \mathcal{Y}_r$, then the payoff
to P1 for move r is

$$X_r^t A_r Y_r, \quad r = 1, \ldots, N,$$

and the game continues until move r + 1 with probability

$$X_r^t P_r Y_r, \quad r = 1, \ldots, N - 1.$$

Now the product of the probability that the game continues until move r
and the payoff for move r is

$$X_r^t A_r Y_r \prod_{h=1}^{r-1} X_h^t P_h Y_h, \quad r = 2, 3, \ldots, N.$$

The expected accumulated payoff for N moves, $v_1(X, Y)$, is the sum
of the above terms:

$$v_1(X, Y) = X_1^t A_1 Y_1 + \sum_{r=2}^{N} X_r^t A_r Y_r \prod_{h=1}^{r-1} X_h^t P_h Y_h. \tag{2}$$

Since the game has a finite number of moves and a finite number of
strategies, it must have a value and optimal strategies.^1 Recall that the
sets $\mathcal{X}$ and $\mathcal{Y}$ contain optimal strategies. Therefore, the function
$v_1(X, Y)$ has at least one saddle point over the sets $\mathcal{X}$ and $\mathcal{Y}$.

^1 von Neumann and Morgenstern [21].
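Equation (2) can be evaluated directly for any fixed pair of behavior strategies. The following Python sketch does so by carrying along the probability that the game reaches each move; the function names and the two-cell data are hypothetical and serve only to illustrate the formula.

```python
def bilinear(x, M, y):
    """Return x^t M y for probability vectors x, y and matrix M."""
    return sum(x[i] * M[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))


def accumulated_payoff(Xs, Ys, As, Ps):
    """Evaluate equation (2) for behavior strategies Xs, Ys.

    Xs, Ys, As have one entry per move (N entries); Ps holds the
    continuation matrices for moves 1..N-1 (N - 1 entries).
    """
    total = 0.0
    reach = 1.0  # probability that the game reaches the current move
    for r in range(len(As)):
        total += reach * bilinear(Xs[r], As[r], Ys[r])
        if r < len(Ps):
            reach *= bilinear(Xs[r], Ps[r], Ys[r])
    return total


# Hypothetical two-cell data: detection is certain in the same cell, and
# the game continues with probability 0.75 after a miss.
A = [[1.0, 0.0], [0.0, 1.0]]
P = [[0.0, 0.75], [0.75, 0.0]]
uniform = [0.5, 0.5]
v1 = accumulated_payoff([uniform] * 3, [uniform] * 3, [A] * 3, [P] * 2)
print(v1)
```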
3.2 Recursive Solution

We show how to compute the minimax of equation (2) by a recursive
technique. Let $\bar{X}_r$ and $\bar{Y}_r$ denote the sequences of mixed strategies

$$\bar{X}_r = (X_r, X_{r+1}, \ldots, X_N), \quad \bar{Y}_r = (Y_r, Y_{r+1}, \ldots, Y_N), \quad r = 1, \ldots, N.$$

Of course, we have $\bar{X}_1 = X$ and $\bar{Y}_1 = Y$. We rewrite equation (2) and
also define the scalar functions $v_r(\bar{X}_r, \bar{Y}_r)$ by

$$v_r(\bar{X}_r, \bar{Y}_r) = X_r^t A_r Y_r + (X_r^t P_r Y_r)\, v_{r+1}(\bar{X}_{r+1}, \bar{Y}_{r+1}), \quad r = 1, \ldots, N, \qquad v_{N+1} \equiv 0. \tag{3}$$

Now $v_r(\bar{X}_r, \bar{Y}_r)$ may be interpreted as the expected accumulated
payments received by P1 on the last N - r + 1 moves of the game.

Equation (3) leads us to believe that the recursive optimization
technique of dynamic programming can be employed. We will establish this
fact by theorem 1. We define $\hat{v}_r$, $\hat{X}_r$, $\hat{Y}_r$ by the following equations:

$$\hat{v}_r = \max_{X_r \in \mathcal{X}_r} \min_{Y_r \in \mathcal{Y}_r} \left[ X_r^t A_r Y_r + (X_r^t P_r Y_r)\, \hat{v}_{r+1} \right] = \hat{X}_r^t A_r \hat{Y}_r + (\hat{X}_r^t P_r \hat{Y}_r)\, \hat{v}_{r+1}, \quad r = 1, \ldots, N, \qquad \hat{v}_{N+1} = 0. \tag{4}$$

The minimax theorem of von Neumann [21] establishes the existence of
$\hat{X}_r$, $\hat{Y}_r$, $\hat{v}_r$ as defined by equation (4). The following theorem then relates
the solutions of equation (4) to the solutions of the sequential game.
Theorem 1: $\hat{v}_1$ is the value of the sequential game, and
$\hat{X} = (\hat{X}_1, \ldots, \hat{X}_N)$, $\hat{Y} = (\hat{Y}_1, \ldots, \hat{Y}_N)$ are optimal strategies for P1
and P2, respectively.

Proof: Since $v_1(X, Y)$ from equation (3) is the expected payoff
function for the sequential game, a necessary and sufficient condition for
$\hat{v}_1$ to be the value of the game and $\hat{X}$, $\hat{Y}$ optimal strategies is

$$v_1(X, \hat{Y}) \le \hat{v}_1 \le v_1(\hat{X}, Y) \quad \text{all } X \in \mathcal{X} \text{ and } Y \in \mathcal{Y}.$$

We shall show that the above condition is satisfied by $\hat{v}_1$, $\hat{X}$, $\hat{Y}$ as
defined by (4). From (4), we have

$$X_r^t A_r \hat{Y}_r + (X_r^t P_r \hat{Y}_r)\, \hat{v}_{r+1} \le \hat{v}_r \le \hat{X}_r^t A_r Y_r + (\hat{X}_r^t P_r Y_r)\, \hat{v}_{r+1}, \quad \text{all } X_r \in \mathcal{X}_r,\ Y_r \in \mathcal{Y}_r. \tag{5}$$

Write $\hat{\bar{X}}_r = (\hat{X}_r, \ldots, \hat{X}_N)$. We begin an inductive argument. For r = N,

$$v_N(\hat{\bar{X}}_N, \bar{Y}_N) = \hat{X}_N^t A_N Y_N \ge \hat{v}_N \quad \text{all } Y_N.$$

Now assume

$$v_{r+1}(\hat{\bar{X}}_{r+1}, \bar{Y}_{r+1}) \ge \hat{v}_{r+1} \quad \text{for some } r \text{ and all } \bar{Y}_{r+1}.$$

By definition,

$$v_r(\hat{\bar{X}}_r, \bar{Y}_r) = \hat{X}_r^t A_r Y_r + (\hat{X}_r^t P_r Y_r)\, v_{r+1}(\hat{\bar{X}}_{r+1}, \bar{Y}_{r+1}).$$

By the inductive assumption and $\hat{X}_r^t P_r Y_r \ge 0$,

$$v_r(\hat{\bar{X}}_r, \bar{Y}_r) \ge \hat{X}_r^t A_r Y_r + (\hat{X}_r^t P_r Y_r)\, \hat{v}_{r+1} \quad \text{all } \bar{Y}_r.$$

From the preceding equation and equation (5),

$$v_r(\hat{\bar{X}}_r, \bar{Y}_r) \ge \hat{v}_r \quad \text{all } \bar{Y}_r.$$

Hence, by induction on r,^1

$$v_1(\hat{X}, Y) \ge \hat{v}_1 \quad \text{all } Y \in \mathcal{Y}.$$

Similarly, we may establish

$$v_1(X, \hat{Y}) \le \hat{v}_1 \quad \text{all } X \in \mathcal{X};$$

therefore,

$$v_1(X, \hat{Y}) \le \hat{v}_1 \le v_1(\hat{X}, Y) \quad \text{all } X \in \mathcal{X},\ Y \in \mathcal{Y},$$

and the theorem is true.

In order to find the value and optimal strategies, we can solve
equation (4) recursively. Each iteration of equation (4) requires the
solution of an ordinary matrix game. Now the solutions (value and optimal
strategies) of a matrix game can be found by solving a linear programming
formulation of the game.^2 Hence, we can find the solutions of the sequential
game by optimizing a sequence of N linear programs. Of course,
the amount of computational effort required by this method will usually
be substantially less than the amount required to solve the sequential
game in direct normal form.

^1 Notice that by definition of $\hat{\bar{X}}_r$ we have $\hat{\bar{X}}_1 = \hat{X} = (\hat{X}_1, \ldots, \hat{X}_N)$.

^2 See Charnes and Cooper [3].
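The backward recursion of equation (4) can be sketched in Python for the smallest case of two cells, where each stage game is 2 x 2 and its value has a closed form. This restriction is an assumption made to keep the example free of a linear programming solver; for general n, the call to `value_2x2` would be replaced by a linear program as described above. All function names and the sample data are hypothetical.

```python
def value_2x2(M):
    """Value and optimal mixed strategies of a 2x2 matrix game.

    The row player maximizes. Returns (value, x, y).
    """
    (a, b), (c, d) = M
    lower = max(min(a, b), min(c, d))   # best pure guarantee for rows
    upper = min(max(a, c), max(b, d))   # best pure guarantee for columns
    if lower == upper:                  # saddle point in pure strategies
        x = [1.0, 0.0] if min(a, b) >= min(c, d) else [0.0, 1.0]
        y = [1.0, 0.0] if max(a, c) <= max(b, d) else [0.0, 1.0]
        return lower, x, y
    den = a + d - b - c                 # nonzero when there is no saddle point
    x = [(d - c) / den, (a - b) / den]
    y = [(d - b) / den, (a - c) / den]
    return (a * d - b * c) / den, x, y


def solve_finite_game(As, Ps):
    """Backward induction on equation (4) for 2x2 stage games.

    As has N payoff matrices, Ps has N - 1 continuation matrices.
    Returns a list of (v_r, X_r, Y_r) for r = 1, ..., N.
    """
    N = len(As)
    v_next = 0.0                        # v_{N+1} = 0
    stages = [None] * N
    for r in range(N - 1, -1, -1):
        if r < len(Ps):
            M = [[As[r][i][j] + Ps[r][i][j] * v_next for j in range(2)]
                 for i in range(2)]
        else:
            M = As[r]                   # last move: no continuation term
        stages[r] = value_2x2(M)
        v_next = stages[r][0]
    return stages


# Hypothetical two-cell data, identical on every move.
A = [[1.0, 0.0], [0.0, 1.0]]
P = [[0.0, 0.75], [0.75, 0.0]]
for r, (v, x, y) in enumerate(solve_finite_game([A] * 3, [P] * 2), start=1):
    print(r, v, x, y)
```

Only N stage games of the original size are solved, which is the source of the computational saving over the normal form.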
4. THE INFINITE SEQUENTIAL GAME

4.1 Formulation

In this section, we allow an infinite number of moves in the sequential
game. Before giving an analytic formulation, we discuss some of the
features of the game. In the infinite sequential game, we do not assume
a maximum number of moves. The continuation probabilities alone control
the termination of the game. However, we assume that the same
payoff matrix and the same continuation probability matrix are specified
for all moves. We further assume that the probability of continuing until
the next move is strictly less than one for all pairs of strategies. This
assumption guarantees boundedness of the expected accumulated payments
received by P1, and it guarantees that the game terminates with
probability one, although the number of moves may not be bounded. Now we
turn to a formal definition of the game under consideration.

As in the previous games, we assume that a search region is specified
and that it is divided into n cells. If P1 chooses cell i (i = 1, ..., n)
and P2 chooses cell j (j = 1, ..., n) on move r (r = 1, 2, ...), then P1
receives from P2 the payoff

$$a_{ij}$$

and the game continues until move r + 1 with probability

$$p_{ij}, \quad 0 \le p_{ij} < 1. \tag{6}$$

Let P be the n x n matrix $P = (p_{ij})$ and A the n x n matrix $A = (a_{ij})$. A
is the payoff matrix, and P is the matrix of continuation probabilities for
every move. We further assume that the game is zero sum and that P1
is the maximizing player.

The game which we have defined above is one of "perfect recall";
and by Kuhn's [10] theorem, a "behavior strategy" is optimal. If a
player uses a behavior strategy, he plays the same mixed strategy over
the alternatives in an information set each time the information set is
reached, regardless of the past history of the game. Since the matrices
A and P apply to every move, the game has only one information set.
Therefore, a behavior strategy is simply a mixed strategy which is used
for every move of the game. We restrict our attention to these strategies.
Let $X = (x_1, \ldots, x_n)$ and $Y = (y_1, \ldots, y_n)$ be behavior strategies
(mixed strategies over the alternatives) for P1 and P2, respectively. For
example, P1 chooses alternative i with probability $x_i$ on every move. The
expected accumulated payment received by P1, v(X, Y), when P1
chooses X and P2 chooses Y, is simply the sum over all r of the
probability that the game lasts until move r times the payment to P1 for move r:

$$v(X, Y) = \sum_{r=0}^{\infty} (X^t P Y)^r\, X^t A Y. \tag{7}$$

The above sum converges, since (6) implies $0 \le X^t P Y < 1$ for all
strategies X and Y. For convenience, we define the matrix $Q = (q_{ij})$ with
$q_{ij} = 1 - p_{ij}$ for all i, j. Then Q is the matrix of positive termination
probabilities. Equation (7) may be written as

$$v(X, Y) = \frac{X^t A Y}{1 - X^t P Y} = \frac{X^t A Y}{X^t Q Y}. \tag{8}$$
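The step from (7) to (8) is the geometric series together with the fact that the components of X and of Y each sum to one. Written out, with E denoting the n x n matrix of ones (a symbol introduced here only for this remark, so that $X^t E Y = 1$):

```latex
\sum_{r=0}^{\infty} (X^{t} P Y)^{r}\, X^{t} A Y
  = \frac{X^{t} A Y}{1 - X^{t} P Y},
\qquad
1 - X^{t} P Y = X^{t} E Y - X^{t} P Y = X^{t} (E - P)\, Y = X^{t} Q Y .
```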



von Neumann [20] first established the existence of a unique value $\hat{v}$ and
optimal strategies $\hat{X}$ and $\hat{Y}$ for the form in (8), i.e., there exists a unique
real number $\hat{v}$ and strategies $\hat{X}$, $\hat{Y}$ such that

$$\frac{X^t A \hat{Y}}{X^t Q \hat{Y}} \le \hat{v} \le \frac{\hat{X}^t A Y}{\hat{X}^t Q Y} \quad \text{all strategies } X, Y. \tag{9}$$

An elementary proof of this fact was subsequently given by Loomis [11],
and this result is a special case of Shapley's [17] more general
"stochastic game". Neuts [13] formulated and solved a special case of
the infinite sequential game. His P matrix was a diagonal matrix and
his A matrix also had a special form.
4.2 Solution by Perturbation

There are no known methods for computing a solution to equation (9).
In this section, we develop a computational method to approximate a
solution to any desired degree of accuracy. The method is based on a
linear programming formulation of a matrix game with an unknown
parameter in the constraints. We show that this parameter is equal to the
value of the game if and only if the optimal objective function of the linear
program is zero. The remainder of our discussion is then devoted to a
method for approximating the required value of the parameter.

To begin, we establish a lemma which relates the solution of the
infinite sequential game to the solution of an ordinary two-person
zero-sum game.

Lemma: A necessary and sufficient condition for $\hat{v}$ to be the value
of the infinite sequential game and $\hat{X}$, $\hat{Y}$ optimal strategies is that the
two-person zero-sum game with payoff matrix $A - \hat{v} Q$ has value zero
and optimal strategies $\hat{X}$, $\hat{Y}$.

Proof: For the matrix game $A - \hat{v} Q$ to have value zero and optimal
strategies $\hat{X}$, $\hat{Y}$, it is necessary and sufficient that

$$X^t (A - \hat{v} Q) \hat{Y} \le 0 \le \hat{X}^t (A - \hat{v} Q) Y \quad \text{all strategies } X, Y. \tag{10}$$

But $X^t Q Y > 0$ for all strategies X, Y. Hence, $\hat{v}$, $\hat{X}$, $\hat{Y}$ satisfy (10)
if and only if

$$\frac{X^t A \hat{Y}}{X^t Q \hat{Y}} \le \hat{v} \le \frac{\hat{X}^t A Y}{\hat{X}^t Q Y} \quad \text{all strategies } X, Y. \tag{10a}$$

Equation (10a) is a necessary and sufficient condition for $\hat{v}$ to be the value
of the infinite sequential game and $\hat{X}$, $\hat{Y}$ optimal strategies. Hence, the
lemma is true.

This lemma immediately suggests a method for computing $\hat{v}$. The
general procedure is to choose a number s and compute the value of the
game A - sQ. If the value of A - sQ is zero, then $s = \hat{v}$ and we are
finished. If the value of A - sQ is not zero, then we want to choose a
new value of s, say s', such that the value of A - s'Q is "closer" to
zero than the value of A - sQ. We begin by formulating the matrix game
A - sQ as a linear program.

Consider the linear program

$$\begin{aligned} \max\ & u \\ \text{subject to}\ & u\, e^t - X^t (A - sQ) \le 0, \\ & X^t e = 1, \\ & X \ge 0, \end{aligned} \tag{11}$$

where e is the n x 1 vector of all "ones", X is an n x 1 vector, s is a fixed
scalar, and u is a scalar variable. Let $\hat{u}_s$, $\hat{X}_s$ be an optimal solution to
(11). Then from Charnes [2], $\hat{X}_s$ is an optimal strategy for P1 and $\hat{u}_s$ is
the value of the game A - sQ (s fixed). Of course, an optimal strategy
$\hat{Y}_s$ for P2 is part of an optimal solution to the dual of (11), and $\hat{Y}_s$ is
available when (11) is solved by the simplex method.



Next, we examine the variation in $\hat{u}_s$ which results from a change
in s. This will allow us to perturb s in such a way that we move $\hat{u}_s$
closer to zero. We consider a perturbation from s to $s + \xi$ in problem
(11), and we want to relate $\hat{u}_s$ to $\hat{u}_{s+\xi}$. We add and subtract the vector
$\xi X^t Q$ in the constraints of (11) and obtain the following equivalent
linear program:

$$\begin{aligned} \max\ & u \\ \text{subject to}\ & u\, e^t - X^t (A - (s + \xi) Q) - \xi X^t Q \le 0, \\ & X^t e = 1, \\ & X \ge 0. \end{aligned} \tag{12}$$

We seek to obtain a linear programming formulation of the game
$A - (s + \xi) Q$ from (12). Hence, we let $\bar{q} = \max_{i,j} q_{ij}$ and $\underline{q} = \min_{i,j} q_{ij}$,
and then for $\xi > 0$

$$\xi\, \underline{q}\, e^t \le \xi X^t Q \le \xi\, \bar{q}\, e^t \quad \text{all strategies } X. \tag{13}$$

Now consider the following linear program:

$$\begin{aligned} \max\ & u' \\ \text{subject to}\ & u' e^t - X^t (A - (s + \xi) Q) \le \xi\, \bar{q}\, e^t, \\ & X^t e = 1, \\ & X \ge 0. \end{aligned} \tag{14}$$

Problem (14) is "less constrained" than (12). Therefore, the respective
optimal solutions must satisfy ("hats" on the variables denote optimal
values)

$$\hat{u}_s \le \hat{u}'. \tag{15}$$

Notice that the right-hand side of the constraints in (14) is a constant
vector. We bring this vector over to the left-hand side of the
constraints and make the change of variable

$$u = u' - \xi\, \bar{q} \tag{16}$$

to obtain the program

$$\begin{aligned} \max\ & (u + \xi\, \bar{q}) \\ \text{subject to}\ & u\, e^t - X^t (A - (s + \xi) Q) \le 0, \\ & X^t e = 1, \\ & X \ge 0. \end{aligned} \tag{17}$$

But (17) is the desired linear programming formulation of the game
$A - (s + \xi) Q$ except for the additive constant $\xi\, \bar{q}$ in the objective
function. Hence,

$$\hat{u} = \hat{u}_{s+\xi}$$

and

$$\hat{u}' = \hat{u}_{s+\xi} + \xi\, \bar{q}.$$

From (15) and the above equation,

$$\hat{u}_s \le \hat{u}_{s+\xi} + \xi\, \bar{q}.$$

By using the left-hand side of (13), we get by a similar argument

$$\hat{u}_{s+\xi} + \xi\, \underline{q} \le \hat{u}_s.$$

Thus, for $\xi > 0$,

$$\hat{u}_s - \xi\, \bar{q} \le \hat{u}_{s+\xi} \le \hat{u}_s - \xi\, \underline{q}, \quad \xi > 0, \tag{18}$$

and for $\xi < 0$ we can derive the relationship

$$\hat{u}_s - \xi\, \underline{q} \le \hat{u}_{s+\xi} \le \hat{u}_s - \xi\, \bar{q}, \quad \xi < 0. \tag{19}$$

Equations (18) and (19) give the desired relationships. We can choose
a starting value of s and then subsequently perturb $\hat{u}_s$ toward zero.

s 

To assist in choosing a starting value of s, we propose the following
method. Two numbers, m and M ($m \le M$), are determined such that
$\hat{u}_m \ge 0$ and $\hat{u}_M \le 0$. Then, since $\hat{u}_s$ is a continuous function of s,^1
$\hat{u}_s = 0$ for some s in the range $m \le s \le M$. Furthermore, this value
of s is unique. We would then choose the initial value of s to satisfy
$m \le s \le M$. Suppose we take m and M to be

$$m = \min_{i,j} \frac{a_{ij}}{q_{ij}}, \qquad M = \max_{i,j} \frac{a_{ij}}{q_{ij}}; \tag{20}$$

then

$$m\, q_{ij} \le a_{ij}, \qquad M q_{ij} \ge a_{ij} \quad \text{all } i, j.$$

From the constraints of (11), we see that

$$\hat{u}_s = \min_j \sum_{i=1}^{n} \hat{x}_i (a_{ij} - s\, q_{ij}),$$

where the $\hat{x}_i$ are the components of $\hat{X}_s$; thus,

$$\hat{u}_m \ge 0, \qquad \hat{u}_M \le 0.$$

With certain restrictions on the elements $a_{ij}$, we can derive tighter
bounds than m and M; but the bounds given here are adequate for most
applications.

^1 This fact is clear from the foregoing derivation.
4.3 Iterative Method

There are various methods which can be used in connection with
equations (18) and (19) to move from one value of s to another. We
propose one method here which will automatically change s and drive
$\hat{u}_s$ to zero. We find two numbers, b and $\xi$, for fixed s which will give

$$-b \le \hat{u}_{s+\xi} \le b. \tag{21}$$

By reference to (18) and (19), we find that

$$\xi = \frac{2 \hat{u}_s}{\bar{q} + \underline{q}}, \qquad b = \frac{|\hat{u}_s|\, (\bar{q} - \underline{q})}{\bar{q} + \underline{q}}. \tag{22}$$

The following method for approximating a solution to equation (9) can
now be stated. For convenience, let $\hat{u}_i$ be the optimal value of (11)
when $s_i$ is the value of s in the constraint set.

Approximation Method

1. Choose a starting value $s_1$ and calculate $\hat{u}_1$ from (11).

2. Given $s_i$ and $\hat{u}_i$, let

$$s_{i+1} = s_i + \frac{2 \hat{u}_i}{\bar{q} + \underline{q}}$$

and calculate $\hat{u}_{i+1}$ from (11).



Let $U = (\hat{u}_1, \hat{u}_2, \ldots)$ be the sequence generated by the above
method. We now show that this sequence does indeed converge absolutely
to zero. By step 2 and equations (21) and (22), we have

$$|\hat{u}_{i+1}| \le \alpha\, |\hat{u}_i|, \qquad \alpha = \frac{\bar{q} - \underline{q}}{\bar{q} + \underline{q}};$$

notice that $0 \le \alpha < 1$, and we also have

$$|\hat{u}_{n+1}| \le \alpha^n\, |\hat{u}_1|.$$

Since $\alpha^n \to 0$, we have $|\hat{u}_n| \to 0$. Because of this convergence, we
can approximate the value of the game to any desired accuracy by the
above method.
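A minimal Python sketch of the approximation method follows, restricted to two cells so that the value of each game A - sQ has a closed form. That restriction is an assumption made to avoid a linear programming solver; for general n, `value_2x2` would be replaced by a solution of program (11). The starting value is m from equation (20). The function names, the tolerance, and the sample matrices are hypothetical.

```python
def value_2x2(M):
    """Value of a 2x2 matrix game in which the row player maximizes."""
    (a, b), (c, d) = M
    lower = max(min(a, b), min(c, d))
    upper = min(max(a, c), max(b, d))
    if lower == upper:                  # saddle point in pure strategies
        return lower
    return (a * d - b * c) / (a + d - b - c)


def approximate_value(A, Q, tol=1e-12, max_iter=1000):
    """Iterate s <- s + 2u / (q_hi + q_lo), where u = Val[A - sQ].

    Stops when |u| < tol and returns the final s as the approximate
    value of the infinite sequential game.
    """
    q_hi = max(max(row) for row in Q)
    q_lo = min(min(row) for row in Q)
    # Starting value m = min a_ij / q_ij from equation (20).
    s = min(A[i][j] / Q[i][j] for i in range(2) for j in range(2))
    for _ in range(max_iter):
        M = [[A[i][j] - s * Q[i][j] for j in range(2)] for i in range(2)]
        u = value_2x2(M)
        if abs(u) < tol:
            break
        s += 2.0 * u / (q_hi + q_lo)
    return s


# Hypothetical data: detection (and termination) is certain when both
# players choose the same cell; otherwise the game ends with probability 0.25.
A = [[1.0, 0.0], [0.0, 1.0]]
Q = [[1.0, 0.25], [0.25, 1.0]]
print(approximate_value(A, Q))
```

For this symmetric example, uniform strategies give $X^t A Y / X^t Q Y = 0.5 / 0.625 = 0.8$, which the iteration should reproduce.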

We point out a few interesting features of the proposed method.
From step 2, the "driving force" which moves us from $s_i$ to $s_{i+1}$ is
directly proportional to the magnitude of $\hat{u}_i$. This feature serves to
drive $\hat{u}_i$ rapidly toward zero. We also observe that when $\bar{q} = \underline{q}$, we will
have $\hat{u}_2 = 0$. This is a nice feature since $\bar{q} = \underline{q}$ implies that all elements
of the Q matrix are equal and, therefore, the game reduces to an ordinary
matrix game (see equation (8)).

An alternative to the method which we have proposed here is to use
the contraction property of the operator T which is defined as

$$T\alpha = \mathrm{Val}[A + \alpha P],$$

where $\alpha$ is a scalar and $\mathrm{Val}[A + \alpha P]$ denotes the value of the matrix
game $A + \alpha P$. By equation (8), the value $\hat{v}$ of the game satisfies
$\hat{v} = \mathrm{Val}[A + \hat{v} P]$, so $\hat{v}$ is the fixed point of T. Shapley [17] has shown
in a more general setting that $T^n \alpha$ approaches the value of the infinite
stochastic game for arbitrary $\alpha$, and Charnes and Schroeder [5] have shown
how to obtain a desired approximation to the value or optimal strategies by
using linear programming methods.
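The fixed-point iteration can be sketched in Python for two cells. Rearranging equation (8) gives $v = X^t A Y + v\, X^t P Y$, so the sketch iterates the map that sends $\alpha$ to the value of the matrix game $A + \alpha P$; the fixed point of that map is the value of the game. The closed-form 2 x 2 value, the function names, the iteration count, and the sample data are assumptions made for illustration.

```python
def value_2x2(M):
    """Value of a 2x2 matrix game in which the row player maximizes."""
    (a, b), (c, d) = M
    lower = max(min(a, b), min(c, d))
    upper = min(max(a, c), max(b, d))
    if lower == upper:                  # saddle point in pure strategies
        return lower
    return (a * d - b * c) / (a + d - b - c)


def shapley_iteration(A, P, alpha=0.0, n_iter=200):
    """Apply alpha <- Val[A + alpha * P] n_iter times and return alpha."""
    for _ in range(n_iter):
        M = [[A[i][j] + alpha * P[i][j] for j in range(2)] for i in range(2)]
        alpha = value_2x2(M)
    return alpha


# Hypothetical data: P = 1 - Q for Q = [[1, 0.25], [0.25, 1]].
A = [[1.0, 0.0], [0.0, 1.0]]
P = [[0.0, 0.75], [0.75, 0.0]]
print(shapley_iteration(A, P))
print(shapley_iteration(A, P, alpha=5.0))
```

Because every continuation probability is strictly less than one, the map is a contraction, and the starting value of $\alpha$ does not affect the limit.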



4.4 Tactical Payoffs and a Special Case

For tactical purposes, two particular payoffs are appealing. As in
the n-cell game, the payoff for each move may be the probability that
P1 detects P2 in that move. Then, in both sequential games, the expected
accumulated payment received by P1 will be the probability that P1
detects P2 in the game. The other payoff of interest is obtained by taking
all $a_{ij}(r) = 1$. Then P1 always receives a payoff of one unit for each
move, and the expected accumulated payment received by P1 is simply
the expected number of moves. When each move takes the same length
of time, we may, of course, also interpret this payoff as the expected
duration of the game. In one of the games in Charnes and Schroeder [5],
we show how to incorporate both of these payoffs in the same formulation.
In particular, P1 attempts to maximize the probability of detection while
constraining the expected number of moves to be no more than a specified
number.

Finally, we show that the infinite sequential game reduces to an
ordinary matrix game when P1 attempts to minimize the expected number
of moves. From the above discussion, we take all $a_{ij} = 1$ for the
expected accumulated payment to be the expected number of moves. Then
equation (8) becomes

$$v(X, Y) = \frac{1}{X^t Q Y}.$$

For search problems, P1 wants to choose a mixed strategy X to
minimize the expected number of moves. Therefore, we seek to solve
the equation

$$\hat{v} = \min_X \max_Y \frac{1}{X^t Q Y} = \frac{1}{\hat{X}^t Q \hat{Y}}. \tag{24}$$

Clearly, $\hat{v}$, $\hat{X}$, and $\hat{Y}$ satisfy (24) if they satisfy

$$\frac{1}{\hat{v}} = \max_X \min_Y X^t Q Y = \hat{X}^t Q \hat{Y}.$$

Hence, P1 can minimax the expected number of moves by maximining
the probability that the game terminates in one move. This is a
noteworthy feature of the infinite sequential game. Of course, in this
case our perturbation technique is not required since we can solve the
game directly by means of a single linear program.